ABSTRACT

A methodology for enhancing the significant spectral features in
Landsat data is introduced.

The process by which significant spectral features are determined
uses a minimum entropy model to guide subsequent analysis
efforts.

Classification results using traditional and minimum entropy
methods are presented and discussed.


1. INTRODUCTION 

The Landsat satellite system was designed to provide a global
earth observation system capable of delivering digital tape data
and photography to a wide audience of researchers as well as
practicing resource managers. The widespread adoption of Landsat
data as a useful source of information for on-going earth
resources programs has made the original concept a great success.

This very success has created a large and growing community of
users who need sophisticated pattern recognition techniques, but
who are not themselves prepared to develop and refine the
required techniques. This new "breed" of user, often a
practicing environmental resource manager or an individual of
similar training and experience, generally has limited time and
funds to support computer-aided analysis of Landsat data.
Nevertheless, he wants sophisticated analysis that can be done
quickly and cheaply. Thus there is a need, as well as an
opportunity, to join technique developers with users to improve
those techniques that effectively transform the raw data into
usable information at the highest possible speed and lowest
possible cost. This study reports on the efforts of just such a
partnership, bringing together two individuals with differing yet
complementary interests to efficiently translate Landsat data
into information readily usable by the varied user community of
the Landsat system. The problem addressed in this study, at least
in an introductory way, was how to improve the process of
spectral feature selection, that is, locating the spectral
training sets which are to comprise the prototypes for
classification of the entire data set.


2. THE MINIMUM-ENTROPY CONCEPT

Entropy is a statistical measure of uncertainty. At the onset of
a Landsat data analysis effort, one is confronted with
considerable uncertainty as to the number and kind of spectral
classes that are contained on the data tape. In fact, considering
an ensemble of potential spectral classes, the optimum feature
selection mode will be that which chooses features which minimize
the entropy of this ensemble. Since this is equivalent to
minimizing the dispersion of the various pattern populations, it
is reasonable to expect that such a feature selection mode will
have clustering properties. This concept can be effectively used
in the design of an optimum feature selection process within a
pattern recognition system (refer to Tou and Heydorn, 1967).

Consider a pattern recognition system which is designed to
recognize K pattern classes.

For each pattern class, the feature selection process within the
system will determine the set of discriminating features which
are necessary for a correct recognition of these classes.

Assume that each of the K pattern populations is characterized by
a normal probability density function and that the covariance
matrices, describing the statistics of the K pattern classes, are
equal.

Let, for Landsat satellite data:

   X_{(n,m)} : a matrix of "n" pixels in "m" spectral bands.
               This matrix is a pattern of the i-th training set,
               where i ≤ K.

   Y_{(n,p)} : a matrix of "n" pixels in p ≤ m transformed
               images.

   A_{(m,p)} : a transformation matrix of the spectral bands.
               The columns of this matrix are the feature
               vectors of the pattern classes.

The method employed generates a linear transformation matrix A
(of "p" feature vectors) which operates on X to yield a new
matrix Y, so that the intraset dispersion (entropy) of Y is
minimized. This transformation may be written as:

   Y_{(n,p)} = X_{(n,m)} A_{(m,p)}
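The link between entropy and dispersion can be made explicit under the normality assumption stated above. The expression below is the standard entropy of a p-dimensional normal density, not a formula quoted from the source. The symbol C (the common covariance matrix of the rows of X) is introduced here for illustration, and the columns of A are assumed to be orthonormal.

```latex
% C is an assumed symbol: the common covariance matrix of the rows of X.
% The covariance of the rows of Y = X A is then A^T C A.
H(Y) = \frac{1}{2} \ln\!\left[ (2\pi e)^{p} \, \det\!\left( A^{T} C A \right) \right]
```

Under these assumptions, minimizing the entropy of Y over A amounts to minimizing the determinant of the transformed covariance matrix, which is a measure of intraset dispersion.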


If one assumes a multivariate normal distribution for each
pattern population, this function is characterized by its mean
vector and covariance matrix which is, in turn, characterized by
its eigenvalues and eigenvectors. These eigenvectors carry the
information describing the properties of the patterns under
consideration. However, some of these eigenvectors bear less
information, in a pattern recognition sense, than others, and may
be ignored. In fact, it would be desirable to use a method which
provides for the selection of only the most significant feature
vectors. Such a method is possible since the entropy function of
Y is minimized when we select the "p" eigenvectors associated
with the "p" smallest eigenvalues in forming the transformation
matrix A (refer again to Tou and Heydorn, 1967, or Watanabe,
1969).
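The selection rule above can be sketched for the simplest case of two spectral bands and p = 1. This is a minimal illustration, not the authors' program: the function names and the pixel values are hypothetical, and the sample covariance of a single training set stands in for the common covariance matrix.

```python
import math

def covariance_2x2(pixels):
    """Sample covariance terms (s_aa, s_ab, s_bb) of (band_a, band_b) pixels."""
    n = len(pixels)
    mean_a = sum(p[0] for p in pixels) / n
    mean_b = sum(p[1] for p in pixels) / n
    s_aa = sum((p[0] - mean_a) ** 2 for p in pixels) / (n - 1)
    s_bb = sum((p[1] - mean_b) ** 2 for p in pixels) / (n - 1)
    s_ab = sum((p[0] - mean_a) * (p[1] - mean_b) for p in pixels) / (n - 1)
    return s_aa, s_ab, s_bb

def smallest_eigenpair(s_aa, s_ab, s_bb):
    """Smallest eigenvalue and unit eigenvector of [[s_aa, s_ab], [s_ab, s_bb]]."""
    half_trace = (s_aa + s_bb) / 2.0
    radius = math.hypot((s_aa - s_bb) / 2.0, s_ab)
    lam = half_trace - radius
    if abs(s_ab) > 1e-12:
        # (s_ab, lam - s_aa) solves (C - lam I) v = 0 for a symmetric 2x2 C.
        vx, vy = s_ab, lam - s_aa
    else:
        # Diagonal covariance: the eigenvectors are the band axes.
        vx, vy = (1.0, 0.0) if s_aa <= s_bb else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def project(pixels, a):
    """Y = X A for a single feature vector a (p = 1)."""
    return [p[0] * a[0] + p[1] * a[1] for p in pixels]

# Hypothetical digital counts for one training set in two contiguous bands.
pixels = [(22, 18), (25, 22), (21, 17), (30, 29), (28, 26), (24, 20)]
lam, a = smallest_eigenpair(*covariance_2x2(pixels))
y = project(pixels, a)
mean_y = sum(y) / len(y)
var_y = sum((v - mean_y) ** 2 for v in y) / (len(y) - 1)
```

In this sketch the variance of the transformed values `var_y` equals the selected eigenvalue `lam` up to rounding, which is the sense in which choosing the smallest eigenvalue minimizes the dispersion of Y.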




3. PROPERTIES OF THE MINIMUM-ENTROPY TRANSFORMATION

The main properties of this method are:

a. The reduction in dimensionality of the patterns.

In fact, the minimization of the entropy function implies the
mathematical idea of information compression over the coordinate
system so that most of the random patterns are concentrated on a
few coordinates instead of being widely distributed among all of
them.

b. The orthonormality of the features and the transformed image.

This is due to the fact that the primary vectors are the
eigenvectors of a real symmetric matrix (covariance matrix). The
orthonormality implies that the transformed images are
uncorrelated. Note, however, that for Landsat satellite data
there is some redundancy in the information content between
contiguous bands.

c. Rank ordering of the features as a function of their relative
discriminant importance.

In fact this rank is made according to the descending order of
the associated eigenvalues. Since it is possible to demonstrate,
see e.g. Kendall (1972), that there exists an equivalence between
the values of the eigenvalues and the variances of the features,
the resulting features will contain, owing to their low variance,
the maximum possible discriminating information concerning the
pattern classes.
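Properties b and c can be checked numerically for a two-band covariance matrix. The sketch below is illustrative only: the covariance values are hypothetical and the helper names are not taken from the source. It verifies that the two eigenvectors are orthonormal, that the transformed covariance is diagonal (uncorrelated images), and that each transformed variance equals its eigenvalue.

```python
import math

def eigen_2x2(s_aa, s_ab, s_bb):
    """Eigenvalues in ascending order and unit eigenvectors of a symmetric 2x2."""
    half_trace = (s_aa + s_bb) / 2.0
    radius = math.hypot((s_aa - s_bb) / 2.0, s_ab)
    lam_small, lam_large = half_trace - radius, half_trace + radius
    if abs(s_ab) > 1e-12:
        vx, vy = s_ab, lam_small - s_aa
    else:
        vx, vy = (1.0, 0.0) if s_aa <= s_bb else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    v_small = (vx / norm, vy / norm)
    # For a symmetric matrix the second eigenvector is perpendicular to the first.
    v_large = (-v_small[1], v_small[0])
    return (lam_small, v_small), (lam_large, v_large)

def quad(u, cov, v):
    """Bilinear form u^T C v with C given as (s_aa, s_ab, s_bb)."""
    s_aa, s_ab, s_bb = cov
    return u[0] * (s_aa * v[0] + s_ab * v[1]) + u[1] * (s_ab * v[0] + s_bb * v[1])

cov = (4.0, 1.5, 3.0)  # hypothetical covariance of two contiguous bands
(lam1, v1), (lam2, v2) = eigen_2x2(*cov)

dot_v1_v2 = v1[0] * v2[0] + v1[1] * v2[1]  # orthogonality of the features
cross_cov = quad(v1, cov, v2)              # covariance between transformed images
var1 = quad(v1, cov, v1)                   # variance of the first transformed image
var2 = quad(v2, cov, v2)                   # variance of the second transformed image
```

Here `dot_v1_v2` and `cross_cov` vanish up to rounding, while `var1` and `var2` reproduce `lam1` and `lam2`, so ordering the eigenvalues orders the features by variance.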

In relation to the properties discussed above and the physical
characteristics of the Landsat spectral bands, it is possible to
define a vector of features formed by the 1st eigenvector,
associated with the smallest eigenvalue, for each pair of
contiguous spectral bands:

   4 (0.5 - 0.6 µm) and 5 (0.6 - 0.7 µm)

   5 (0.6 - 0.7 µm) and 6 (0.7 - 0.8 µm)

   6 (0.7 - 0.8 µm) and 7 (0.8 - 1.1 µm)

This vector is used for training set selection.
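One possible reading of this construction is sketched below: for each contiguous band pair, the pixels are projected onto the eigenvector with the smallest eigenvalue, and the three resulting values are stacked into one feature vector per pixel. The function names, the centering of the data and the digital counts are assumptions made for illustration, not details given in the source.

```python
import math

BAND_PAIRS = [(4, 5), (5, 6), (6, 7)]  # contiguous Landsat band pairs

def min_entropy_component(xs, ys):
    """Project two bands onto the eigenvector with the smallest eigenvalue."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    lam = (sxx + syy) / 2.0 - math.hypot((sxx - syy) / 2.0, sxy)
    if abs(sxy) > 1e-12:
        vx, vy = sxy, lam - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx <= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # The sign of an eigenvector is arbitrary, so the sign of the output is too.
    return [(x - mx) * vx + (y - my) * vy for x, y in zip(xs, ys)]

def feature_vectors(bands):
    """One 3-component feature vector per pixel, one component per band pair."""
    components = [min_entropy_component(bands[a], bands[b]) for a, b in BAND_PAIRS]
    return list(zip(*components))

# Hypothetical digital counts for six pixels in bands 4 to 7.
bands = {
    4: [22, 25, 21, 30, 28, 24],
    5: [18, 22, 17, 29, 26, 20],
    6: [40, 35, 42, 31, 33, 38],
    7: [45, 38, 47, 30, 34, 42],
}
features = feature_vectors(bands)
```

Each entry of `features` holds the values of one pixel in the three first transformed images, which is the kind of vector the text proposes for guiding training set selection.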

4. AN EXAMPLE OF THE MINIMUM ENTROPY MODEL AS APPLIED TO LANDSAT
DATA

Landsat data for the border area between Lucca and Pisa
provinces, the forested shoreline near lake Massaciuccole, were
processed using "standard" pattern recognition methods, that is,
commonly used digital techniques for conducting an unsupervised
classification, leading to a land use type map. Figure 1 shows a
black and white print of the final color classification image
(map). Water at the lower left (Tyrrhenian Sea) and upper right
(lake Massaciuccole) is very dark, while unclassified areas
(primarily agricultural land types) are depicted as off-white.

Three classes of woodland are depicted in the center of the scene
as light, medium and dark grey. The correctness of this
classification map for the three woodland classes was verified
using black and white aerial photography at a scale of 1:13,000.

Thus, this classification correctly discerns three forest types
that vary in their density (of trees), age (tree height and
width), and/or composition (e.g. ratio of trees to brush).

FIGURE 1 - Final classification image (woodland, water) derived
from traditional technique.

What is important to note, moreover, is the fact that in many
cases the "accuracy" of the classification as determined by the
user will very often be subjectively determined after viewing
the final classification map (image). With the more traditional
approach to processing and analysing Landsat data, as was used to
make this classification, such an image is not available until
the very end of the processing program.

The use of the minimum entropy model to prepare a transformed
image that is made available early in the processing program is
an initial advantage over traditional procedures. Figure 2 (also
a black and white print of the color original) shows just such an
image for a slightly larger area than that depicted in Figure 1.

Figure 2 shows more of the Tyrrhenian Sea on the left and lake
Massaciuccole in the upper center of the image. In the center,
grey levels indicate the same varying spectral diversity amongst
the wooded shoreline as was shown in Figure 1. However, this
image is made available to the user early in the processing
program (refer to Figure 3), and can be used immediately as a
guide to training set selection.

Employment of the minimum entropy model also provides the user
with a very powerful analytic tool to complement his subjective
reactions to the 1st transformed image.

As was suggested earlier, it is very important to proceed in a
recognition program with training sets that are different and
representative of the entire scene.

The hierarchical classification method, applied to the feature
vectors (refer, again, to Figure 3), provides the analyst/user
with a convenient means to satisfy these conditions, that is,
different yet representative.

The hierarchical classification is represented diagrammatically
in Figure 4. This figure summarizes the relationships between
every pair of groups (training sets, entire scene, and cumulative
behaviour of training sets) in the form of a dendrogram. These
relationships are expressed in terms of correlation coefficients
so that a statistical threshold, set by the user, can be
employed.
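A dendrogram of this kind can be sketched with a simple agglomerative grouping driven by correlation coefficients. The source does not specify the linkage rule or how each group is summarized, so the single-linkage rule, the group profiles and the threshold value below are all assumptions chosen for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def dendrogram_merges(groups):
    """Single-linkage agglomeration; returns merges as (members_a, members_b, r)."""
    clusters = [(name,) for name in groups]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Similarity of two clusters: their most correlated pair of members.
                r = max(pearson(groups[a], groups[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or r > best[0]:
                    best = (r, i, j)
        r, i, j = best
        merges.append((clusters[i], clusters[j], r))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical profiles (e.g. histograms of transformed values) for each group.
groups = {
    "woodland_1": [1, 4, 9, 4, 1, 0],
    "woodland_2": [0, 1, 4, 9, 4, 1],
    "water_1": [9, 4, 1, 0, 0, 0],
    "entire_scene": [5, 5, 7, 7, 3, 1],
}
merges = dendrogram_merges(groups)

threshold = 0.8  # hypothetical user-set threshold on the correlation coefficient
not_different = [m for m in merges if m[2] >= threshold]
```

Groups that merge at a correlation above the threshold would be read as not significantly different, while merges below it separate groups that the analyst can treat as distinct.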

What can be seen in Figure 4(a) is that the five spectral classes 
(3 woodland, 2 water) are different, but they are not representa- 
tive of the entire scene. Why? 

Refer back to Figure 1. Note that this scene includes
unclassified areas which are depicted in an off-white color. Thus
the hierarchical classifier is reporting correctly that the three
woodland and two water classes, while different, are not
representative of the entire scene. In Figure 4(b), where the
dimensions of the scene have been reduced to exclude unclassified
areas (off-white), we find that the classes are now both
different and representative of the entire scene. The feedback
loop characterizing this process, which provides sets of features
for review by the user, as shown in Figure 3, is easily and
rapidly performed by the computer.

FIGURE 2 - Minimum-entropy transformation color-composite image.

FIGURE 3 - Scheme of the procedure for training set selection.
[Flowchart with analyst and user roles: Landsat input of spectral
band pairs 4-5, 5-6, 6-7; transformation of each pair of bands
via the minimum entropy method; images associated with the
smallest eigenvalues (or variances).]

FIGURE 4 - Example of dendrograms obtained after the hierarchical
classification. [Axis labels: groups; threshold value at the 5%
level of significance.]

(a) All training sets are different but not representative of
the entire scene.

(b) All training sets are different and representative of the
entire scene.


5. CONCLUSION 

A proven pattern recognition procedure, the minimum entropy
model, has been employed for processing a small portion of
Landsat data. This trial was conducted to investigate the impact
of the model upon the speed, clarity and accuracy of the final
results when compared with a more traditional approach. The
authors have found that the minimum entropy model may provide
several useful advantages over traditional techniques. First,
total computer time to conduct a complete pattern recognition
process is reduced. Second, subjective (transformed image), as
well as statistically derived, information is made available to
the analyst/user much earlier in the analysis process. A rapid
feedback loop in which numerous training set combinations can be
tested for difference and representativeness is available.

Additional tests of Landsat data processing using the minimum
entropy model are clearly justified. Data sets of differing
spectral composition, drawn from other areas and other seasons,
should be evaluated before general conclusions and
recommendations are made.